AITopics | proposition iii

Preserving Domain Generalization in Fine-Tuning via Joint Parameter Selection

Pan, Bin, Shen, Shiyu, Wang, Zongbin, Shi, Zhenwei, Xu, Xia

arXiv.org Artificial Intelligence | Aug-26-2025

Domain generalization seeks to develop models trained on a limited set of source domains that are capable of generalizing effectively to unseen target domains. While the predominant approach leverages large-scale pre-trained vision models as initialization, recent studies have highlighted that full fine-tuning can compromise the intrinsic generalization capabilities of these models. To address this limitation, parameter-efficient adaptation strategies have emerged, wherein only a subset of model parameters is selectively fine-tuned, thereby balancing task adaptation with the preservation of generalization. Motivated by this paradigm, we introduce Joint Parameter Selection (JPS), a novel method that restricts updates to a small, sparse subset of parameters, thereby retaining and harnessing the generalization strength of pre-trained models. Theoretically, we establish a generalization error bound that explicitly accounts for the sparsity of parameter updates, thereby providing a principled justification for selective fine-tuning. Practically, we design a selection mechanism employing dual operators to identify and update parameters exhibiting consistent and significant gradients across all source domains. Extensive benchmark experiments demonstrate that JPS achieves superior performance compared to state-of-the-art domain generalization methods, substantiating both the efficiency and efficacy of the proposed approach.
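The selection idea in this abstract can be illustrated with a minimal sketch. The abstract does not specify the "dual operators", so the two criteria below (sign agreement across domains for consistency, top-fraction mean magnitude for significance) and the names `select_parameters`, `sparse_update`, and `keep_fraction` are hypothetical stand-ins, not the authors' method.

```python
import numpy as np

def select_parameters(domain_grads, keep_fraction=0.1):
    """Return a boolean mask over parameters (illustrative criteria only).

    domain_grads: array of shape (num_domains, num_params),
                  one flattened gradient per source domain.
    Assumed criteria (not taken from the paper):
      consistent  -> same nonzero gradient sign in every domain
      significant -> mean |gradient| among the top `keep_fraction`
    """
    grads = np.asarray(domain_grads, dtype=float)
    signs = np.sign(grads)
    consistent = (signs[0] != 0) & np.all(signs == signs[0], axis=0)

    magnitude = np.abs(grads).mean(axis=0)
    k = max(1, int(round(keep_fraction * grads.shape[1])))
    threshold = np.sort(magnitude)[-k]  # k-th largest mean magnitude
    significant = magnitude >= threshold

    return consistent & significant

def sparse_update(params, domain_grads, lr=0.1, keep_fraction=0.1):
    """Gradient step applied only to the selected parameters."""
    mask = select_parameters(domain_grads, keep_fraction)
    mean_grad = np.mean(np.asarray(domain_grads, dtype=float), axis=0)
    return np.asarray(params, dtype=float) - lr * mask * mean_grad, mask
```

Parameters outside the mask keep their pre-trained values, which is the sense in which a sparse update leaves most of the pre-trained model untouched.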

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.16976

Country: Asia > China (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Revisiting Transformers with Insights from Image Filtering

Abdullaev, Laziz U., Tkachenko, Maksim, Nguyen, Tan M.

arXiv.org Artificial Intelligence | Jun-13-2025

The self-attention mechanism, a cornerstone of Transformer-based state-of-the-art deep learning architectures, is largely heuristic-driven and fundamentally challenging to interpret. Establishing a robust theoretical foundation to explain its remarkable success and limitations has therefore become an increasingly prominent focus in recent research. Some notable directions have explored understanding self-attention through the lens of image denoising and nonparametric regression. While promising, existing frameworks still lack a deeper mechanistic interpretation of various architectural components that enhance self-attention, both in its original formulation and subsequent variants. In this work, we aim to advance this understanding by developing a unifying image processing framework, capable of explaining not only the self-attention computation itself but also the role of components such as positional encoding and residual connections, including numerous later variants. We also pinpoint potential distinctions between the two concepts building upon our framework, and make an effort to close this gap. We introduce two independent architectural modifications within transformers. While our primary objective is interpretability, we empirically observe that image processing-inspired modifications can also lead to notably improved accuracy and robustness against data contamination and adversaries across language and vision tasks, as well as better long-sequence understanding.
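The nonparametric-regression view mentioned in this abstract can be checked numerically. The sketch below illustrates that general, well-known correspondence only, not the paper's framework: when all keys have the same norm, scaled dot-product softmax attention coincides with a Nadaraya-Watson estimator using a Gaussian kernel. The function names and the equal-norm assumption are ours.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Scaled dot-product attention for a single head."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def gaussian_kernel_regression(Q, K, V, bandwidth):
    """Nadaraya-Watson estimator: kernel-weighted average of the values."""
    sq_dist = ((Q[:, None, :] - K[None, :, :]) ** 2).sum(axis=-1)
    logits = -sq_dist / (2.0 * bandwidth ** 2)
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# With equal-norm keys, -||q - k||^2 / (2 h^2) differs from q.k / h^2 only by
# terms that are constant across keys, so both weightings normalize to the
# same values when h^2 = sqrt(d).
rng = np.random.default_rng(0)
d = 8
Q = rng.normal(size=(4, d))
K = rng.normal(size=(6, d))
K /= np.linalg.norm(K, axis=-1, keepdims=True)  # assumption: unit-norm keys
V = rng.normal(size=(6, 3))

attention_out = softmax_attention(Q, K, V)
kernel_out = gaussian_kernel_regression(Q, K, V, bandwidth=d ** 0.25)
```

Without the equal-norm assumption the two outputs generally differ, because the key norms then reweight the kernel.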

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.10371

Country:

Asia (0.28)
North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Decentralized Optimization in Time-Varying Networks with Arbitrary Delays

Ortega, Tomas, Jafarkhani, Hamid

arXiv.org Machine Learning | May-29-2024

We consider a decentralized optimization problem for networks affected by communication delays. Examples of such networks include collaborative machine learning, sensor networks, and multi-agent systems. To mimic communication delays, we add virtual non-computing nodes to the network, resulting in directed graphs. This motivates investigating decentralized optimization solutions on directed graphs. Existing solutions assume nodes know their out-degrees, resulting in limited applicability. To overcome this limitation, we introduce a novel gossip-based algorithm, called DT-GO, that does not need to know the out-degrees. The algorithm is applicable in general directed networks, for example networks with delays or limited acknowledgment capabilities. We derive convergence rates for both convex and non-convex objectives, showing that our algorithm achieves the same complexity order as centralized Stochastic Gradient Descent. In other words, the effects of the graph topology and delays are confined to higher-order terms. Additionally, we extend our analysis to accommodate time-varying network topologies. Numerical simulations are provided to support our theoretical findings.
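The delay-modelling construction described in this abstract (virtual non-computing nodes) can be sketched as a graph transformation. The details are assumptions made for illustration: delays are taken to be whole numbers of communication rounds, each virtual node forwards a message one round later, and the name `expand_delays` and the relay labels are hypothetical. This shows only the graph construction, not the DT-GO algorithm.

```python
def expand_delays(edges):
    """Replace delayed links by chains of virtual relay nodes.

    edges: iterable of (sender, receiver, delay) with integer delay >= 0,
           measured in communication rounds (an assumption of this sketch).
    Returns a list of directed edges (u, v) with no delays, where a link
    with delay d becomes a path through d non-computing relay nodes.
    """
    expanded = []
    for sender, receiver, delay in edges:
        previous = sender
        for hop in range(delay):
            # Hypothetical label for a virtual node that only forwards.
            relay = ("relay", sender, receiver, hop)
            expanded.append((previous, relay))
            previous = relay
        expanded.append((previous, receiver))
    return expanded

# A bidirectional link whose two directions have different delays.
delayed_links = [("a", "b", 0), ("b", "a", 2)]
directed_graph = expand_delays(delayed_links)
```

Even when the physical link is bidirectional, the expanded graph is directed, because relay nodes only pass messages forward. This is why the abstract moves to decentralized optimization on directed graphs.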

algorithm, graph, node, (14 more...)

arXiv.org Machine Learning

2405.19513

Country:

North America > United States > California > Orange County > Irvine (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Time, Travel, and Energy in the Uniform Dispersion Problem

Amir, Michael, Bruckstein, Alfred M.

arXiv.org Artificial Intelligence | Apr-30-2024

We investigate the algorithmic problem of uniformly dispersing a swarm of robots in an unknown, gridlike environment. In this setting, our goal is to comprehensively study the relationships between performance metrics and robot capabilities. We introduce a formal model comparing dispersion algorithms based on makespan, traveled distance, energy consumption, sensing, communication, and memory. Using this framework, we classify several uniform dispersion algorithms according to their capability requirements and performance. We prove that while makespan and travel can be minimized in all environments, energy cannot, as long as the swarm's sensing range is bounded. In contrast, we show that energy can be minimized even by simple, "ant-like" robots in synchronous settings and asymptotically minimized in asynchronous settings, provided the environment is topologically simply connected. Our findings offer insights into fundamental limitations that arise when designing swarm robotics systems for exploring unknown environments, highlighting the impact of the environment's topology on the feasibility of energy-efficient dispersion.

algorithm, robot, time step, (16 more...)

arXiv.org Artificial Intelligence

2404.19564

Country:

Asia > Middle East > Israel (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Energy (0.35)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Decentralized Optimization in Networks with Arbitrary Delays

Ortega, Tomas, Jafarkhani, Hamid

arXiv.org Artificial Intelligence | Jan-20-2024

We consider the problem of decentralized optimization in networks with communication delays. To accommodate delays, we need decentralized optimization algorithms that work on directed graphs. Existing approaches require nodes to know their out-degree to achieve convergence. We propose a novel gossip-based algorithm that circumvents this requirement, allowing decentralized optimization in networks with communication delays. We prove that our algorithm converges on non-convex objectives, with the same main complexity order term as centralized Stochastic Gradient Descent (SGD), and show that the graph topology and the delays only affect the higher order terms. We provide numerical simulations that illustrate our theoretical results.

converge, graph, node, (14 more...)

arXiv.org Artificial Intelligence

2401.11344

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

A Neural-Network-Based Convex Regularizer for Inverse Problems

Goujon, Alexis, Neumayer, Sebastian, Bohra, Pakshal, Ducotterd, Stanislas, Unser, Michael

arXiv.org Artificial Intelligence | Aug-25-2023

The emergence of deep-learning-based methods to solve image-reconstruction problems has enabled a significant increase in reconstruction quality. Unfortunately, these new methods often lack reliability and explainability, and there is growing interest in addressing these shortcomings while retaining the boost in performance. In this work, we tackle this issue by revisiting regularizers that are the sum of convex-ridge functions. The gradient of such regularizers is parameterized by a neural network that has a single hidden layer with increasing and learnable activation functions. This neural network is trained within a few minutes as a multistep Gaussian denoiser. The numerical experiments for denoising, CT, and MRI reconstruction show improvements over methods that offer similar reliability guarantees.
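The structure described in this abstract can be made concrete with a small sketch. A sum of convex-ridge functions has the form R(x) = sum_i psi_i(w_i^T x), and its gradient is W^T sigma(W x) with sigma_i = psi_i', which is a single-hidden-layer network with increasing activations. The sketch fixes psi(t) = log cosh(t), so sigma = tanh. That choice is an assumption made for illustration; the paper uses learnable activations.

```python
import numpy as np

def regularizer(x, W):
    """R(x) = sum_i psi(w_i^T x) with the convex profile psi(t) = log(cosh(t))."""
    z = W @ x
    # log(cosh(z)) computed stably as logaddexp(z, -z) - log(2).
    return float(np.sum(np.logaddexp(z, -z) - np.log(2.0)))

def regularizer_gradient(x, W):
    """grad R(x) = W^T tanh(W x): one hidden layer, increasing activation."""
    return W.T @ np.tanh(W @ x)

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))  # rows are the ridge directions w_i
x = rng.normal(size=3)

# Central finite differences as a sanity check on the analytic gradient.
step = 1e-6
numeric_gradient = np.array([
    (regularizer(x + step * e, W) - regularizer(x - step * e, W)) / (2 * step)
    for e in np.eye(3)
])
analytic_gradient = regularizer_gradient(x, W)
```

Because each activation is increasing, each profile psi_i is convex, so R is convex by construction.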

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2211.12461

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Geometric Facts Underlying Algorithms of Robot Navigation for Tight Circumnavigation of Group Objects through Singular Inter-Object Gaps

Chernov, Valerii, Matveev, Alexey

arXiv.org Artificial Intelligence | May-22-2023

An underactuated nonholonomic Dubins-vehicle-like robot with a lower-bounded turning radius travels at a constant speed in a plane that hosts unknown complex objects. The robot has to approach and then circumnavigate all objects while maintaining a given distance to the currently nearest of them. The ideal targeted path is therefore the equidistant curve of the entire set of objects. The focus is on the case where this curve cannot be perfectly traced due to excessive contortions and singularities. The objective then becomes automatically finding, approaching, and repeatedly tracing the best approximation of the equidistant curve among those trackable by the robot. The paper presents geometric facts needed in research on reactive tight circumnavigation of group objects in this situation.

artificial intelligence, lem, robot, (18 more...)

arXiv.org Artificial Intelligence

2305.12913

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New Jersey (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(8 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.93)

Distributionally Robust Optimization using Cost-Aware Ambiguity Sets

Schuurmans, Mathijs, Patrinos, Panagiotis

arXiv.org Machine Learning | May-16-2023

We present a novel framework for distributionally robust optimization (DRO), called cost-aware DRO (CADRO). The key idea of CADRO is to exploit the cost structure in the design of the ambiguity set to reduce conservatism. Particularly, the set specifically constrains the worst-case distribution along the direction in which the expected cost of an approximate solution increases most rapidly. We prove that CADRO provides both a high-confidence upper bound and a consistent estimator of the out-of-sample expected cost, and show empirically that it produces solutions that are substantially less conservative than existing DRO methods, while providing the same guarantees.
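For context on what an ambiguity set does, here is a minimal sketch of generic distributionally robust evaluation over a discrete distribution. It is not CADRO. It uses a total-variation ball, a standard cost-agnostic choice, and computes the worst-case expected cost in closed form. The function name and the choice of ball are assumptions made for illustration.

```python
import numpy as np

def worst_case_expected_cost(costs, probs, radius):
    """Maximize sum_i q_i * costs_i over distributions q with
    total-variation distance 0.5 * ||q - probs||_1 <= radius.

    The maximizer moves up to `radius` probability mass onto the most
    expensive outcome, taking it from the cheapest outcomes first.
    """
    costs = np.asarray(costs, dtype=float)
    q = np.asarray(probs, dtype=float).copy()

    worst = int(np.argmax(costs))
    budget = min(radius, 1.0 - q[worst])  # cannot move more mass than exists
    q[worst] += budget

    for i in np.argsort(costs):  # cheapest outcomes first
        if i == worst:
            continue
        moved = min(q[i], budget)
        q[i] -= moved
        budget -= moved
        if budget <= 0:
            break

    return float(costs @ q), q

costs = [0.0, 1.0, 2.0]
nominal = [0.5, 0.3, 0.2]
robust_cost, worst_distribution = worst_case_expected_cost(costs, nominal, radius=0.1)
```

A ball like this perturbs the distribution without regard to which direction matters for the decision at hand. The abstract's stated idea is to reduce conservatism by shaping the set using the cost structure instead.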

ambiguity, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

2303.09408

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Accelerated Algorithms for a Class of Optimization Problems with Equality and Box Constraints

Parashar, Anjali, Srivastava, Priyank, Annaswamy, Anuradha M.

arXiv.org Artificial Intelligence | May-7-2023

Convex optimization with equality and inequality constraints is ubiquitous in optimization and control problems for large-scale systems. Recently there has been considerable interest in establishing accelerated convergence of the loss function. A class of high-order tuners was recently proposed to achieve accelerated convergence when no constraints are present. In this paper, we propose a new high-order tuner that can accommodate the presence of equality constraints. To accommodate the underlying box constraints, time-varying gains are introduced in the high-order tuner; these leverage convexity and ensure anytime feasibility of the constraints. Numerical examples are provided to support the theoretical derivations.

artificial intelligence, convex, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2305.04433

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

Model Extraction Attacks Against Reinforcement Learning Based Controllers

Sajid, Momina, Shen, Yanning, Shoukry, Yasser

arXiv.org Artificial Intelligence | Apr-25-2023

We introduce the problem of model-extraction attacks in cyber-physical systems in which an attacker attempts to estimate (or extract) the feedback controller of the system. Extracting (or estimating) the controller provides an unmatched edge to attackers since it allows them to predict the future control actions of the system and plan their attack accordingly. Hence, it is important to understand the ability of the attackers to perform such an attack. In this paper, we focus on the setting when a Deep Neural Network (DNN) controller is trained using Reinforcement Learning (RL) algorithms and is used to control a stochastic system. We play the role of the attacker that aims to estimate such an unknown DNN controller, and we propose a two-phase algorithm. In the first phase, also called the offline phase, the attacker uses side-channel information about the RL-reward function and the system dynamics to identify a set of candidate estimates of the unknown DNN. In the second phase, also called the online phase, the attacker observes the behavior of the unknown DNN and uses these observations to shortlist the set of final policy estimates. We provide theoretical analysis of the error between the unknown DNN and the estimated one. We also provide numerical results showing the effectiveness of the proposed algorithm.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2304.13090

Country: North America > United States > California > Orange County > Irvine (0.14)

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
